5 - Machine Learning for Physicists

The following content has been provided by the University of Erlangen-Nürnberg.

Okay, hello everyone, good evening. So last time we tried to fit a function or even an image using a neural network.

Today we will do something different, namely we turn to some of the more important applications of neural networks,

which is to recognize images. So the goal would be the one displayed in this picture.

You are shown an image and you have to recognize this image.

And for example here the network would try to tell you, oh this is Emmy Noether.

The question is how this is done. Now we will not really go for images of people,

because then I would need a database of annotated images that tells me the name of each person.

Rather, we turn to one of the most well-known training databases for image recognition, which is for handwriting recognition.

There is a data set available on the internet that was compiled by the U.S. National Institute of Standards and Technology (NIST).

This data set simply consists of digits 0 to 9 that were handwritten by many different employees

of a government office (the U.S. Census Bureau) and, in addition, by high school students.

So the idea for applications is of course that you would use this to recognize postal codes on an envelope,

so as to route the letter to the right address. And the goal of the network is to recognize this

as a 3, independently of how exactly it is written. The goal of training the network is to show it

many, many of these examples, on the order of 50,000, to train it on these examples,

and then to have a network that is able to recognize digits even if it has never seen them before.

So when we go through this example, we will learn several different things.

First, we will learn how neural networks learn to distinguish different categories.

These categories could be whether the image is a cat or a dog, or the name of the person,

but in our case the different categories are simply the 10 different digits, 0 to 9.

We will then learn that it is better for the network to predict not just one precise category,

but rather a probability distribution over the categories, expressing which category the network thinks is more likely and which is less likely.

That leads us to a new kind of nonlinear function, the softmax, which is a generalization of the sigmoid function.
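For concreteness, here is a minimal NumPy sketch of that function (the values are illustrative, not from the lecture):

```python
import numpy as np

def softmax(z):
    # Turn a vector of raw scores into a probability distribution.
    # Subtracting max(z) leaves the result unchanged but avoids overflow.
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

scores = np.array([2.0, 1.0, 0.1])
print(softmax(scores))        # approx. [0.659 0.242 0.099]
print(softmax(scores).sum())  # 1.0, as a probability distribution should be
```

For two categories, softmax applied to the scores (z, 0) gives 1/(1 + exp(-z)) for the first category, which is exactly the familiar sigmoid; that is the sense in which softmax generalizes it.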

Together with this nonlinear function comes a new cost function, the categorical cross-entropy, which replaces the quadratic cost function

that we have talked about so far. Then we will also learn how to deal with these large training data sets,

and that it is often useful to split the total data set into different parts: one for training, one for so-called validation

(we will learn what that means), and one for the real test data on which the network is finally tested.
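As a sketch of what such a split can look like (the proportions 50,000 / 10,000 / 10,000 are an assumption matching the numbers in this lecture, not a fixed rule):

```python
import numpy as np

# Hypothetical split of 70,000 samples into training, validation, and test parts.
rng = np.random.default_rng(seed=0)
indices = rng.permutation(70000)

train_idx = indices[:50000]       # used to adjust the network's weights
val_idx   = indices[50000:60000]  # used to monitor generalization while training
test_idx  = indices[60000:]       # touched only once, for the final evaluation
```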

And then we will also discuss a common problem, which is overfitting,

which is when the network, so to speak, learns to the test. We will see what that means.

Okay, so here's the task in a nutshell. You present an image to the network,

and the network should spit out an answer, in this case the digit.

Now the image itself, as usual, is just a pixel image, say with 28 by 28 input pixels,

as for the training images in the MNIST database.

So there are 784 numbers, gray values between 0 and 1.

These will be the inputs to our neural network, which means the input layer has 784 neurons.
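A minimal sketch of obtaining such inputs, assuming the convenience loader from the Keras library (the lecture itself does not prescribe a specific tool; note the loader delivers 60,000 training images, of which one can reserve 10,000 for validation to match the 50,000 mentioned above):

```python
import numpy as np
from tensorflow.keras.datasets import mnist  # assumption: Keras is available

(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape)  # (60000, 28, 28), gray values 0..255

# Flatten each 28x28 image into 784 numbers and rescale to [0, 1],
# matching the 784 input neurons described above.
x_train = x_train.reshape(-1, 784).astype(np.float32) / 255.0
```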

So these are many more than we have dealt with so far.

Then this is fed through our neural network, and in the end we want the network to announce which number it thinks this is.

Now in principle we could have only one output neuron,

and it directly gives me the number: 0, 1, 2, 3, up to 9.

But it turns out this is much more difficult for the network to handle.

And so it's much better to do it with more neurons, actually,

to have as many neurons as there are categories, in this case 10,

and to train the network to put a 1 into the correct spot, into the correct category, and a 0 everywhere else.

This kind of encoding is of course not very memory efficient, but that doesn't matter here.

It is called a one-hot encoding, because only one of the entries is turned on, that is, "hot", and the rest are off.
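Such an encoding is simple to write down; here is a small sketch:

```python
import numpy as np

def one_hot(digit, num_categories=10):
    # Vector with a 1 at the position of the correct category, 0 elsewhere.
    v = np.zeros(num_categories)
    v[digit] = 1.0
    return v

print(one_hot(3))  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```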

Okay, now this still has some problems, because you know it's discrete, it's binary, it's only 0 or 1, it's integer values, so to speak.

And that's not very good for training a neural network.
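To preview how the pieces announced above fit together, here is a minimal sketch in Keras (the library choice and the hidden-layer size of 30 are assumptions for illustration, not the lecture's actual network):

```python
from tensorflow import keras  # assumption: Keras is available

model = keras.Sequential([
    # 784 gray values in, one small hidden layer (size chosen arbitrarily here),
    keras.layers.Dense(30, activation='sigmoid', input_shape=(784,)),
    # 10 output neurons with softmax: a probability distribution over the digits.
    keras.layers.Dense(10, activation='softmax'),
])

# Categorical cross-entropy replaces the quadratic cost for one-hot targets.
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])
```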

Part of a video series

Access: Open Access

Duration: 01:28:26

Recorded: 2017-06-08

Uploaded: 2017-06-20 07:40:11

Language: en-US
